An original framework for understanding human actions and body language by using deep neural networks
The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour.
By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way.
These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively.
The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements;
both are essential tasks in many computer vision applications, including event recognition and video surveillance.
In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on a 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition using a 3D skeleton and Deep Neural Networks (DNNs) is provided.
The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements.
All the modules were tested on challenging datasets, well known in the literature, showing remarkable results compared to current methods.
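The recurrence exploited by all three modules can be illustrated with a minimal single-unit LSTM step; the sketch below uses hypothetical scalar weights and is only a toy illustration, not the thesis implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    # w maps gate name -> (input weight, recurrent weight, bias); scalar toy cell
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])   # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])   # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])   # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2]) # candidate value
    c = f * c_prev + i * g       # cell state carries long-term context
    h = o * math.tanh(c)         # hidden state exposed to the next layer
    return h, c

# run a toy temporal sequence (e.g., one joint coordinate over time) through the cell
w = {k: (0.5, 0.5, 0.0) for k in "ifog"}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, w)
```

The gated cell state is what lets LSTM-RNNs retain context over long temporal sequences, which is the property the thesis exploits for body-movement analysis.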
Adaptive bootstrapping management by keypoint clustering for background initialization
The availability of a background model that describes the scene is a prerequisite for many computer vision applications. In several situations, the model cannot be easily generated when the background contains some foreground objects (i.e., the bootstrapping problem). In this letter, an Adaptive Bootstrapping Management (ABM) method, based on keypoint clustering, is proposed to model the background of video sequences acquired by mobile and static cameras. First, keypoints are detected on each frame by the A-KAZE feature extractor; then, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is used to find keypoint clusters. These clusters represent the candidate regions of foreground elements inside the scene. The ABM method manages the scene changes generated by foreground elements both in the background model initialization, addressing the bootstrapping problem, and in the background model updating. Moreover, it achieves good results with both mobile and static cameras and requires a small number of frames to initialize the background model.
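The clustering step can be sketched with a minimal pure-Python DBSCAN over 2D keypoint coordinates; the point values are hypothetical and the real method operates on A-KAZE keypoints, but the density-based grouping is the same idea:

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (-1 = noise)."""
    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1              # noise (may become a border point later)
            continue
        cluster += 1                    # new candidate foreground region
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster     # previously-noise point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs_j = neighbours(j)
            if len(nbrs_j) >= min_pts:  # core point: keep expanding the cluster
                seeds.extend(nbrs_j)
    return labels

# two dense keypoint blobs (candidate foreground regions) and one isolated point
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
labels = dbscan(pts, eps=2.0, min_pts=3)
```

Each resulting cluster marks a region dense in keypoints, which the ABM method treats as a candidate foreground element; isolated keypoints are discarded as noise.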
Online separation of handwriting from freehand drawing using extreme learning machines
Online separation between handwriting and freehand drawing is still an active research area in the field of sketch-based interfaces. In recent years, most approaches in this area have focused on statistical separation methods, which have achieved significant results in terms of performance. More recently, Machine Learning (ML) techniques have proven to be even more effective by treating the separation problem as a classification task. Despite this, several aspects of these techniques can still be considered open problems, including: 1) the trade-off between separation performance and training time; 2) the separation of handwriting from different types of freehand drawings. To address these drawbacks, this paper proposes a novel separation algorithm based on a set of original features and an Extreme Learning Machine (ELM). Extensive experiments on a wide range of sketched schemes (i.e., text and graphical symbols), more numerous than those usually tested in key works of the current literature, have highlighted the effectiveness of the proposed approach. Finally, measurements of accuracy and speed of computation, during both training and testing stages, have shown that the ELM can be considered the best choice in this research area, even when compared with other popular ML techniques.
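The speed advantage of an ELM comes from its structure: the hidden-layer weights are random and never trained, so fitting reduces to one closed-form least-squares solve for the output weights. A minimal sketch with NumPy, using toy two-feature strokes rather than the paper's original feature set:

```python
import numpy as np

def elm_train(X, y, n_hidden=20, seed=0):
    """Extreme Learning Machine: random hidden layer + closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # least-squares solve, no iterations
    return W, b, beta

def elm_predict(X, model):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# toy strokes: two hypothetical features each, labels +1 (text) / -1 (drawing)
X = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]])
y = np.array([1.0, 1.0, -1.0, -1.0])
model = elm_train(X, y)
pred = np.sign(elm_predict(X, model))
```

Because training involves no gradient iterations, the training-time side of the trade-off mentioned above is largely eliminated, while classification accuracy depends on the quality of the input features.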
A keypoint-based method for background modeling and foreground detection using a PTZ camera
Automatic scene analysis is still a topic of great interest in computer vision due to the growing possibilities provided by increasingly sophisticated optical cameras. Background modeling, including its initialization and its updating, is a crucial aspect that can play a main role in a wide range of application domains, such as vehicle tracking, person re-identification and object recognition. In any case, many challenges remain partially unsolved, including camera movements (i.e., pan/tilt), scale changes (i.e., zoom-in/zoom-out) and deletion of the initial foreground elements from the background model. This paper describes a method for background modeling and foreground detection able to address all the mentioned challenges. In particular, the proposed method uses spatio-temporal tracking of sets of keypoints to distinguish the background from the foreground. It analyses these sets with a grid strategy to estimate both camera movements and scale changes. The same sets are also used to construct a panoramic background model and to delete possible initial foreground elements from it. Experiments carried out on some challenging videos from three different datasets (i.e., PBI, VOT and Airport MotionSeg) demonstrate the effectiveness of the method on PTZ cameras. Other videos from a further dataset (i.e., FBMS) have been used to measure the accuracy of the proposed method with respect to some key works of the current state-of-the-art. Finally, some videos from another dataset (i.e., SBI) have been used to test the method on stationary cameras.
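To give an intuition for keypoint-based camera-motion estimation, a pure pan/tilt between two frames can be recovered as a robust statistic of matched-keypoint displacements; the sketch below (with hypothetical coordinates, not the paper's actual grid-based estimator) uses the median so that a minority of foreground matches does not bias the estimate:

```python
from statistics import median

def estimate_camera_shift(prev_pts, curr_pts):
    """Estimate global pan/tilt as the median displacement of matched keypoints.
    Background keypoints dominate the match set, so the median is robust to a
    minority of foreground (moving-object) matches."""
    dx = median(c[0] - p[0] for p, c in zip(prev_pts, curr_pts))
    dy = median(c[1] - p[1] for p, c in zip(prev_pts, curr_pts))
    return dx, dy

# four background keypoints shifted by a camera pan of (+5, 0), one foreground outlier
prev_pts = [(10, 10), (40, 12), (70, 30), (90, 80), (50, 50)]
curr_pts = [(15, 10), (45, 12), (75, 30), (95, 80), (80, 90)]
shift = estimate_camera_shift(prev_pts, curr_pts)  # the outlier is suppressed
```

Keypoints whose displacement disagrees with the global estimate are then candidates for the foreground, which is the core of distinguishing camera motion from object motion.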
A Multipurpose Autonomous Robot for Target Recognition in Unknown Environments
In recent years, the technological improvements of consumer robots, in terms of processing capacity and sensors, have enabled an ever-increasing number of researchers to quickly develop both scale prototypes and alternative low-cost solutions. In these contexts, a critical aspect is the design of ad-hoc algorithms according to the features of the available hardware. This paper proposes a prototype of an autonomous robot for mapping unknown environments and recognizing target objects. During the setup phase, one or more target objects are shown to the RGB camera of the robot which, for each of them, extracts and stores a set of A-KAZE features. Afterwards, the robot uses ultrasonic distance measurements and the RGB stream to map the whole environment and search for a set of A-KAZE features matchable with those previously acquired. The paper also reports both preliminary tests carried out on a reference indoor environment and a case study performed in an outdoor one, which validate the proposed system.
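Since A-KAZE descriptors are binary, matching stored features against observed ones reduces to Hamming distances; a common way to keep only reliable matches is a Lowe-style ratio test, sketched here on toy 8-bit descriptors (illustrative values only, far shorter than real A-KAZE descriptors):

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as integers."""
    return bin(a ^ b).count("1")

def match_descriptors(stored, observed, ratio=0.8):
    """Accept a match only when the best stored descriptor is clearly
    closer than the second best (ratio test)."""
    matches = []
    for i, d in enumerate(observed):
        dists = sorted((hamming(d, s), j) for j, s in enumerate(stored))
        if len(dists) > 1 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))  # (observed index, stored index)
    return matches

stored = [0b11110000, 0b00001111]    # features stored during the setup phase
observed = [0b11110001, 0b10101010]  # features extracted while exploring
matches = match_descriptors(stored, observed)
```

Only the first observed descriptor survives the ratio test here; the second is equidistant from both stored features and is therefore rejected as ambiguous, which is exactly the behaviour that keeps target recognition reliable in cluttered scenes.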
A new descriptor for Keypoint-Based background modeling
Background modeling is a preliminary task for many computer vision applications, describing the static elements of a scene and isolating the foreground ones. Defining a robust background model of uncontrolled environments is a current challenge, since the model must manage many issues, e.g., moving cameras, dynamic background, bootstrapping, shadows, and illumination changes. Recently, methods based on keypoint clustering have shown remarkable robustness, especially to bootstrapping and camera movements, while highlighting limitations in the analysis of dynamic background (i.e., trees blowing in the wind or gushing fountains). In this paper, an innovative combination of the RootSIFT descriptor and average pooling is proposed within a keypoint clustering method for real-time background modeling and foreground detection. Compared to renowned descriptors, such as A-KAZE, this combination is invariant to small local changes in the scene, and thus more robust in dynamic background cases. Results obtained from experiments carried out on two benchmark datasets demonstrate how the proposed solution improves on previous keypoint-based models and outperforms several works of the current state-of-the-art.
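The RootSIFT transform itself is simple: L1-normalise the descriptor histogram and take the element-wise square root, so that Euclidean distances between transformed vectors correspond to the Hellinger kernel on the originals; average pooling then collapses a keypoint cluster into a single descriptor. A sketch on toy 4-bin histograms (real SIFT descriptors have 128 bins):

```python
import math

def rootsift(desc, eps=1e-7):
    """RootSIFT: L1-normalise, then element-wise square root (Hellinger mapping)."""
    s = sum(desc) + eps
    return [math.sqrt(v / s) for v in desc]

def average_pool(descs):
    """Average pooling: one representative descriptor per keypoint cluster."""
    n = len(descs)
    return [sum(col) / n for col in zip(*descs)]

# toy 4-bin "histograms" from two keypoints belonging to the same cluster
pooled = average_pool([rootsift([1, 3, 0, 0]), rootsift([2, 2, 0, 0])])
```

Averaging over a cluster smooths out the small per-keypoint fluctuations caused by waving trees or fountains, which is what gives the combination its robustness to dynamic background.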